Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support ImagePullSecret for pulling from private registry #584

Open
wants to merge 29 commits into
base: main
Choose a base branch
from

Conversation

gniltaws
Copy link

What is this PR About?

Adds the ability to specify an image pull secret for pulling a custom image from a private container registry

How do we test this?

It should run as-is if pulling the default image, since it only adds the secret if installPlanApproverAndVerifyJobsImagePullSecret is specified.

Test that the secret works requires

  • Update values.yaml installPlanApproverAndVerifyJobsImage to point at to an image in a private registry
  • Create an imagepull secret with credentials for the private registry
  • Update values.yaml installPlanApproverAndVerifyJobsImagePullSecret

cc: @redhat-cop/day-in-the-life

@itewk
Copy link
Contributor

itewk commented Nov 21, 2024

@gniltaws would you be willing to add a test to https://github.com/redhat-cop/helm-charts/tree/main/charts/operators-installer/_integration-tests as part of this?

I am working to get permissions so i can approve the workflow running, but you can run the tests locally too

@gniltaws
Copy link
Author

@itewk The default image pulls from private registry registry.redhat.io, so I think the only thing that's needed to test this update is to add a Pull Secret to the namespace and put it's name in the values file.

Does this repository have a way to handle secret data (like a token for registry.redhat.io)? If not the rest of this comment is moot.

The simplest way I can see to test this is to modify the existing _integration-tests/test-install-operator-1-automatic-intermediate-manual-upgrades-values.yaml that does the upgrade to use the default image from the private registry and add a installPlanApproverAndVerifyJobsImagePullSecret. That would test the pull secret and the update functionality at the same time. I'm new to automated testing like this, so I'm not sure if testing multiple things in the same step is a good idea.

@garethahealy
Copy link
Contributor

@itewk The default image pulls from private registry registry.redhat.io, so I think the only thing that's needed to test this update is to add a Pull Secret to the namespace and put it's name in the values file.

Does this repository have a way to handle secret data (like a token for registry.redhat.io)? If not the rest of this comment is moot.

no, we don't currently have a secret for registry.redhat.io but if you build a test in your fork, I'll get a token created in this repo.

@garethahealy
Copy link
Contributor

garethahealy commented Nov 22, 2024

echo "${{ secrets.REDHAT_REGISTRY_TOKEN }}" | podman login registry.redhat.io --username "6340056|redhat-cop-helm-charts" --password-stdin
Login Succeeded!

if you want it as a k8s Secret, let me know, and I'll create another secret for that

@itewk
Copy link
Contributor

itewk commented Nov 22, 2024

@garethahealy @gniltaws rather then putting a token in this repo which gives access to red hat private registry, which puts us at risk of violating some rules about use of that repo...why not create a quay.io repo and add a pull secret to that?

@garethahealy
Copy link
Contributor

@garethahealy @gniltaws rather then putting a token in this repo which gives access to red hat private registry, which puts us at risk of violating some rules about use of that repo...why not create a quay.io repo and add a pull secret to that?

I don't see the difference. We are not redistributing any of the redhat.io images, which would be a concern for the RH usage agreement. All this is doing is pulling and testing the image, and once the test is finished, throwing away the environment.

Creating a mirror (which I think is what you are suggesting), makes the situation worse, as we are actually redistributing the image then.

@itewk
Copy link
Contributor

itewk commented Nov 22, 2024

Creating a mirror (which I think is what you are suggesting), makes the situation worse, as we are actually redistributing the image then.

I wouldn't mirror one of the redhat.io images but rather have this test test against some other image that is already publically avaiable

@garethahealy
Copy link
Contributor

garethahealy commented Nov 22, 2024

I wouldn't mirror one of the redhat.io images but rather have this test test against some other image that is already publically avaiable

I see a few issues with that:

  • chart is packaged and released with a custom image ref
  • is the custom image maintained? and who by?
  • if customers or RH services were to use this chart, task one is to replace that reference with the redhat.io image

Feels like that's causing more issues, than the concern of pulling from the redhat.io registry...

@itewk
Copy link
Contributor

itewk commented Nov 22, 2024

the chart woudldn't be packaged with a custom image. the test only would be set to reference the custom image/repo, which the tests actually already do, for precisly this reason, so the tests can run without access to redhat.io (https://github.com/redhat-cop/helm-charts/blob/main/charts/operators-installer/_integration-tests/test-install-operator-0-automatic-intermediate-manual-upgrades-values.yaml#L3)

Really we just need the test to set the secret. we don't really need to verify the secret actually works. you could copy that exact test above and just add the parameter to set the secret.

But anyway, if you all want to have the pull secret for redhat.io here. okay.

@itewk itewk added the enhancement New feature or request label Nov 22, 2024
@itewk itewk self-requested a review November 22, 2024 17:56
@garethahealy
Copy link
Contributor

linting failing due to:

charts/operators-installer/values.yaml
  Error: 16:77 [trailing-spaces] trailing spaces

@gniltaws
Copy link
Author

I don't understand why the secret creation is failing in the integration test. Am I specifying the secret incorrectly? @garethahealy

error: either --from-file or the combination of --docker-username, --docker-password and --docker-server is required
$ grep -e 'docker-username' -e 'docker-password' -e 'docker-server' .github/workflows/install-integration-tests-operators-installer.yaml 
          --docker-password="${{ secrets.REDHAT_REGISTRY_TOKEN }}" \
          --docker-server=registry.redhat.io \
          --docker-username="6340056|redhat-cop-helm-charts"

@garethahealy
Copy link
Contributor

it'll be because your fork/pull request cant access the secret:

so this test would only work on the main branch

@itewk
Copy link
Contributor

itewk commented Dec 2, 2024

it'll be because your fork/pull request cant access the secret:

so this test would only work on the main branch

it doens't h ave to be on main branch, it just as to be a PR from a branch on this repo rather then a fork of this repo.

so optinos

  1. update test to pull from a dummy quay.io repo with a dummy read only robot accoutn where we can just put the access token write in the test rather then uusing github secrets
  2. take the branch for this PR and put it on this repo and re-do the PR from that branch
  3. YOLO it and assume it will pass when we merge, and if it doesn't do another merge

you all have any preferences?

i am fine with any of it

@gniltaws
Copy link
Author

gniltaws commented Dec 3, 2024

I think strategy number 2 is the best strategy, but I don't want to make more work for you.

I'm a little nervous about option 3 (YOLO strategy) working the first time, but I think it would only need minor changes, if anything. If it goes poorly, issue a bugfix and remove the tag for the bad version?

Thoughts? @garethahealy @itewk

@gniltaws gniltaws requested a review from garethahealy December 6, 2024 16:49
@sabre1041
Copy link
Contributor

The reliance on content contained in GitHub secrets presents a challenge not only in the management of credentials, but limitations on the testing aspects within pull requests.

There are two approaches that I might suggest. Regardless of these approaches, I would not maintain an actual, valid credential to any registry. As a result, here are two approaches:

  1. Include credentials to a dummy repository. The image pull secret will be added, but will not actually get used
  2. Spin up a registry:2 registry in the kind cluster, secure it with htpasswd and then copy the publicly available cli image to it. The credentials that is used to secure it does not need to be stored as a GitHub secret since it just contains an already publicly available image.

@gniltaws
Copy link
Author

First, a big thank-you to @sabre1041! That saved me so much trial and error! Thanks also to @itewk for the guidance.

I added my work-in-progress so I could ask a few questions on tooling:

  1. Is skopeo available in the container where the integration testing happens? If not, what's the preferred tool for copying the image into the private repo?
  2. I got as far as I could without adding an Ingress (trying to avoid since I wasn't sure what should go in place of "registry.<ip_address>.nip.io"), but I realize it's needed in place of the registry.registry.svc.cluster.local when pushing/ pulling the image to/from the private repository. What should I use for the host in the Ingress?

@sabre1041
Copy link
Contributor

First, a big thank-you to @sabre1041! That saved me so much trial and error! Thanks also to @itewk for the guidance.

I added my work-in-progress so I could ask a few questions on tooling:

  1. Is skopeo available in the container where the integration testing happens? If not, what's the preferred tool for copying the image into the private repo?
  2. I got as far as I could without adding an Ingress (trying to avoid since I wasn't sure what should go in place of "registry.<ip_address>.nip.io"), but I realize it's needed in place of the registry.registry.svc.cluster.local when pushing/ pulling the image to/from the private repository. What should I use for the host in the Ingress?
  1. Assuming we are using any of the standard GitHub Actions Ubuntu images, skopeo, buildah and podman are all installed
  2. If you are just pushing from the host itself, set the ingress with the hostname registry.localhost. It will resolve locally and route into the kind cluster

@itewk
Copy link
Contributor

itewk commented Dec 13, 2024

@gniltaws now to clean up the pipeline jobs :)

thanks for sticking with this! i know we made it much more complicated on you. but this thing is used in production at many a client so testing important.

@gniltaws
Copy link
Author

Is someone able to trigger the tests again? @itewk @garethahealy

@gniltaws
Copy link
Author

If the skopeo command in this latest version doesn't work, I'm going to need some advice on using skopeo and/or Ingress in a kind cluster

@sabre1041
Copy link
Contributor

If the skopeo command in this latest version doesn't work, I'm going to need some advice on using skopeo and/or Ingress in a kind cluster

What specifically?

@gniltaws
Copy link
Author

If the skopeo command in this latest version doesn't work, I'm going to need some advice on using skopeo and/or Ingress in a kind cluster

What specifically?

How to address the registry in the kind cluster in the skopeo copy command. When I used this:

  skopeo copy
    --dest-creds ${registry_user}:${registry_password} \
    docker://$(awk '$1 ~/Image:/ {print $2}' charts/operators-installer/_integration-tests/test-install-operator-0-automatic-intermediate-manual-upgrades-values.yaml) \
    docker://registry.localhost/origin-cli

I got this error: pinging container registry registry.localhost: Get \"https://registry.localhost/v2/\": dial tcp [::1]:443: connect: connection refused

I tried adding the port on (docker://registry.localhost/origin-cli:5000) and switching from docker:// to https:// and got the same error. This time I'm trying http://, but I'm not sure what to do if that doesn't work. I don't have experience with non-OpenShift k8s (and thus Ingresses) and skopeo.

@sabre1041
Copy link
Contributor

If the skopeo command in this latest version doesn't work, I'm going to need some advice on using skopeo and/or Ingress in a kind cluster

What specifically?

How to address the registry in the kind cluster in the skopeo copy command. When I used this:

  skopeo copy
    --dest-creds ${registry_user}:${registry_password} \
    docker://$(awk '$1 ~/Image:/ {print $2}' charts/operators-installer/_integration-tests/test-install-operator-0-automatic-intermediate-manual-upgrades-values.yaml) \
    docker://registry.localhost/origin-cli

I got this error: pinging container registry registry.localhost: Get \"https://registry.localhost/v2/\": dial tcp [::1]:443: connect: connection refused

I tried adding the port on (docker://registry.localhost/origin-cli:5000) and switching from docker:// to https:// and got the same error. This time I'm trying http://, but I'm not sure what to do if that doesn't work. I don't have experience with non-OpenShift k8s (and thus Ingresses) and skopeo.

Steps to set up port mapping to expose 80/443 are found here

https://kind.sigs.k8s.io/docs/user/ingress

@gniltaws
Copy link
Author

I think I figured out how to expose the registry so skopeo can use it. Could someone trigger the tests again?

@gniltaws
Copy link
Author

The error should be fixed now, is someone able to trigger the tests again?

@gniltaws
Copy link
Author

Thank you! I finally found the issue that I had been blind to. Could you run the tests workflow again?

@gniltaws
Copy link
Author

The integration test only failed because it hit the timeout after 30 minutes, not because of a yaml error or something like that.

Is there a way to see more detail about what went wrong, or is the best course of action just to increase the timeout on the update job?

@gniltaws
Copy link
Author

It still timed out after 30 minutes despite my change to 35 minutes. Any ideas?

@gniltaws
Copy link
Author

gniltaws commented Jan 6, 2025

It still timed out after 30 minutes despite my change to 35 minutes. Any ideas?

Is there a hard limit of 30 minutes, or do I need to change it in a different way?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants